Example-based bi-directional Chinese-English machine translation with semi-automatically induced grammars

Authors

  • Kai-Chung Siu
  • Helen M. Meng
  • Chin-Chung Wong
Abstract

We have previously developed a framework for bi-directional English-to-Chinese/Chinese-to-English machine translation using grammars induced semi-automatically from unannotated corpora. The framework adopts an example-based machine translation (EBMT) approach. This work reports on three extensions to the framework. First, we investigate the comparative merits of three distance metrics (Kullback-Leibler, Manhattan norm, and Gini index) for agglomerative clustering in grammar induction. Second, we seek an automatic evaluation method, based on the BLEU metric, that can also consider multiple translation outputs generated for a single input sentence. Third, our previous investigation showed that Chinese-to-English translation has lower performance due to incorrect use of English inflectional forms, a consequence of random selection among translation alternatives. We present an improved selection strategy that leverages information from the example parse trees in our EBMT paradigm.
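As a rough illustration of the first extension, the sketch below compares two of the named distance metrics in a single agglomerative merge step. It is not the authors' implementation: representing each word by a distribution over neighboring words, the toy counts, the symmetrized form of the Kullback-Leibler divergence, the smoothing constant, and all function names are assumptions made for this example. The Gini index is left out because the abstract does not say how it is used as a distance.

```python
import math
from itertools import combinations


def normalize(counts):
    """Turn raw context counts into a probability distribution."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def manhattan(p, q):
    """Manhattan (L1) norm between two distributions."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def symmetric_kl(p, q, eps=1e-6):
    """KL(p||q) + KL(q||p), with additive smoothing (an assumption)
    so that contexts unseen for one word do not yield infinities."""
    keys = set(p) | set(q)

    def smooth(d):
        raw = {k: d.get(k, 0.0) + eps for k in keys}
        total = sum(raw.values())
        return {k: v / total for k, v in raw.items()}

    ps, qs = smooth(p), smooth(q)
    return sum((ps[k] - qs[k]) * math.log(ps[k] / qs[k]) for k in keys)


def closest_pair(dists, metric):
    """One agglomerative step: find the two items with the smallest distance."""
    pairs = combinations(sorted(dists), 2)
    return min(pairs, key=lambda ab: metric(dists[ab[0]], dists[ab[1]]))


# Hypothetical left-context counts for three words (toy data).
contexts = {
    "monday": {"on": 8, "next": 2},
    "tuesday": {"on": 7, "next": 3},
    "fly": {"to": 9, "from": 1},
}
dists = {w: normalize(c) for w, c in contexts.items()}

for name, metric in [("Manhattan", manhattan), ("symmetric KL", symmetric_kl)]:
    print(name, closest_pair(dists, metric))
```

On this toy data both metrics select the same pair to merge; on real corpora the metrics can rank candidate merges differently, which is the kind of comparison the abstract describes.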


Similar resources


Bilingual Methods for Adaptive Training Data Selection for Machine Translation

In this paper, we propose a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus. In earlier work, we devised a data selection method based on semi-supervised convolutional neural networks (SSCNNs). The new method, Bi-SSCNN, is based on bitokens, which use bilingual...


Combining Linguistic and Statistical Methods for Bi-directional English Chinese Translation in the Flight Domain

In this paper, we discuss techniques to combine an interlingua translation framework with phrase-based statistical methods, for translation from Chinese into English. Our goal is to achieve high-quality translation, suitable for use in language tutoring applications. We explore these ideas in the context of a flight domain, for which we have a large corpus of English queries, obtained from user...


Bi-directional memory-based dialog translation: The KEMDT approach

Keywords: dialog translation, memory (example)-based translation, parallel marker passing, Korean language processing. A bi-directional Korean/English dialog translation system is designed and implemented using the memory-based translation technique. The system KEMDT (Korean/English Memory-based Dialog Translation system) can perform Korean to English, and English to Korean translation using uni...


Inferring Maximally Invertible Bi-grammars for Example-Based Machine Translation

This paper discusses inference strategies of context-free bi-grammars for example-based machine translation (EBMT). The EBMT system EDGAR is discussed in detail. The notion of invertible context-free feature bi-grammar is introduced in order to provide a means to decide upon the degree of ambiguity of the inferred bi-grammar. It is claimed that a maximally invertible bi-grammar can enhance the ...



Publication date: 2003